Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructure metadata encoder to track deps precisely #35684

Merged
merged 22 commits into from
Aug 18, 2016

Conversation

nikomatsakis
Copy link
Contributor

This issue restructures meta-data encoding to track dependencies very precisely. It uses a cute technique I hope to spread elsewhere, where we can guard the data flowing into a new dep-graph task and ensure that it is not "leaking" information from the outside, which would result in missing edges. There are no tests because we don't know of any bugs in the old system, but it's clear that there was leaked data.

The commit series is standalone, but the refactorings are kind of "windy". It's a good idea to read the comment in src/librustc_metadata/index_builder.rs to get a feeling for the overall strategy I was aiming at.

In several cases, I wound up adding some extra hashtable lookups, I think primarily for looking up AdtDef instances. We could remove these by implementing DepGraphRead for an AdtDef and then having it register a read to the adt-defs table, I guess, but I doubt it is really noticeable.

Eventually I think I'd like to extend this pattern to the dep-graph more generally, since it effectively guarantees that data cannot leak.

Fixes #35111.

r? @michaelwoerister

//!
//! In addition to the offset, we need to track the data that was used
//! to generate the contents of each `data_item`. This is so that we
//! can figure out which HIR nodes contributors to that data for
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'contributed'

@michaelwoerister
Copy link
Member

I like it! I'm definitely in favor of spreading the use of this pattern, it should indeed reduce the number of data leaks a lot.

fn read(&self, tcx: TyCtxt);
}

impl DepGraphRead for usize {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why implement this? It seems open the door missing edges.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelwoerister

Why implement this? It seems open the door missing edges.

A good question. I wanted it for indices that we pass down (e.g., the index of a variant), but it'd be semantically better to have some newtyped usize I suppose.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could always pass such instances as part of a FromId wrapper, right?

@nikomatsakis
Copy link
Contributor Author

@michaelwoerister and I chatted a bit on IRC and decided to remove the impl for usize and add an Untracked<T> wrapper instead, so that we can make local assertions that some particular usize is safe.

}
}
}

impl<'a, 'tcx, 'encoder> ItemContentBuilder<'a, 'tcx, 'encoder> {
/// Encode data for the given variant of the given ADT. The
/// indices of the variant is untracked: this is ok because we
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indices => index

@michaelwoerister
Copy link
Member

r=me with the typos fixed.

@nikomatsakis
Copy link
Contributor Author

@bors r=mw

@bors
Copy link
Contributor

bors commented Aug 16, 2016

📌 Commit f0e5161 has been approved by mw

@nikomatsakis
Copy link
Contributor Author

@bors r=mw

@bors
Copy link
Contributor

bors commented Aug 17, 2016

📌 Commit 2f78c91 has been approved by mw

@jonas-schievink
Copy link
Contributor

src/librustc_metadata/encoder.rs:1416:13: 1416:44 error: unresolved name `encode_bounds_and_type_for_item` [E0425]

src/librustc_metadata/encoder.rs:1416             encode_bounds_and_type_for_item(rbml_w, self.ecx, self.index, ty.id);

                                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

src/librustc_metadata/encoder.rs:1416:13: 1416:44 help: run `rustc --explain E0425` to see a detailed explanation

src/librustc_metadata/encoder.rs:1407:9: 1407:28 error: the trait bound `index_builder::ItemContentBuilder<'a, 'tcx, 'encoder>: rustc::hir::intravisit::Visitor<'_>` is not satisfied [E0277]

src/librustc_metadata/encoder.rs:1407         intravisit::walk_ty(self, ty);

                                              ^~~~~~~~~~~~~~~~~~~

src/librustc_metadata/encoder.rs:1407:9: 1407:28 help: run `rustc --explain E0277` to see a detailed explanation

src/librustc_metadata/encoder.rs:1407:9: 1407:28 note: required by `rustc::hir::intravisit::walk_ty`

src/librustc_metadata/encoder.rs:1410:32: 1410:58 error: attempted access of field `rbml_w_for_visit_item` on type `&mut index_builder::ItemContentBuilder<'a, 'tcx, 'encoder>`, but no field with that name was found

src/librustc_metadata/encoder.rs:1410             let rbml_w = &mut *self.rbml_w_for_visit_item;

                                                                     ^~~~~~~~~~~~~~~~~~~~~~~~~~

src/librustc_metadata/encoder.rs:1412:25: 1412:35 error: attempted access of field `index` on type `&mut index_builder::ItemContentBuilder<'a, 'tcx, 'encoder>`, but no field with that name was found

src/librustc_metadata/encoder.rs:1412             let _task = self.index.record(def_id, rbml_w);

                                                              ^~~~~~~~~~

src/librustc_metadata/encoder.rs:1416:63: 1416:73 error: attempted access of field `index` on type `&mut index_builder::ItemContentBuilder<'a, 'tcx, 'encoder>`, but no field with that name was found

src/librustc_metadata/encoder.rs:1416             encode_bounds_and_type_for_item(rbml_w, self.ecx, self.index, ty.id);

                                                                                                    ^~~~~~~~~~

error: aborting due to 4 previous errors

@bors
Copy link
Contributor

bors commented Aug 17, 2016

☔ The latest upstream changes (presumably #35605) made this pull request unmergeable. Please resolve the merge conflicts.

@nikomatsakis
Copy link
Contributor Author

@jonas-schievink so weird; I do not see these locally (make check passes).

The idea is that, this way, we can cleanly isolate ALL state that is
being passed, since it goes as an argument to the fn pointer.
Also write a comment explaining the system.
The comment explains the `index-builder` pattern. We no longer need the
`Deref/DerefMut` relationship, and it seems nicer without it.
The idea is that a `usize` is sort of ambiguous: in this case, it
represents indices that do not need tracking, but it could as easily be
some data read out from a tracked location, and hence represent tracked
data. Therefore, we add an `Untracked` type that lets user assert
that value is not tracked.

Also correct various typos.
@nikomatsakis
Copy link
Contributor Author

@bors r=mw

@bors
Copy link
Contributor

bors commented Aug 17, 2016

📌 Commit 37d974f has been approved by mw

@bors
Copy link
Contributor

bors commented Aug 17, 2016

⌛ Testing commit 37d974f with merge 895ed0e...

@bors
Copy link
Contributor

bors commented Aug 17, 2016

💔 Test failed - auto-win-msvc-64-opt-rustbuild

@nikomatsakis
Copy link
Contributor Author

@bors retry

@nikomatsakis
Copy link
Contributor Author

/me hopes

@bors
Copy link
Contributor

bors commented Aug 18, 2016

⌛ Testing commit 37d974f with merge 499484f...

bors added a commit that referenced this pull request Aug 18, 2016
Restructure metadata encoder to track deps precisely

This issue restructures meta-data encoding to track dependencies very precisely. It uses a cute technique I hope to spread elsewhere, where we can guard the data flowing into a new dep-graph task and ensure that it is not "leaking" information from the outside, which would result in missing edges. There are no tests because we don't know of any bugs in the old system, but it's clear that there was leaked data.

The commit series is standalone, but the refactorings are kind of "windy". It's a good idea to read the comment in `src/librustc_metadata/index_builder.rs` to get a feeling for the overall strategy I was aiming at.

In several cases, I wound up adding some extra hashtable lookups, I think primarily for looking up `AdtDef` instances. We could remove these by implementing `DepGraphRead` for an `AdtDef` and then having it register a read to the adt-defs table, I guess, but I doubt it is really noticeable.

Eventually I think I'd like to extend this pattern to the dep-graph more generally, since it effectively guarantees that data cannot leak.

Fixes #35111.

r? @michaelwoerister
@bors bors merged commit 37d974f into rust-lang:master Aug 18, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants